Enhancing Large Language Models for End-to-End Circuit Analysis Problem Solving
Chen, Liangliang, Sun, Weiyu, Zhang, Ying
Large language models (LLMs) have shown strong performance in data-rich domains such as programming, but their reliability in engineering tasks remains limited. Circuit analysis, which requires multimodal understanding and precise mathematical reasoning, highlights these challenges. Although Gemini 2.5 Pro improves diagram interpretation and analog-circuit reasoning, it still struggles to consistently produce correct solutions when given both text and circuit diagrams. At the same time, engineering education needs scalable AI tools capable of generating accurate solutions for tasks such as automated homework feedback and question answering. This paper presents an enhanced, end-to-end circuit problem solver built on Gemini 2.5 Pro. We first benchmark Gemini on a representative set of undergraduate circuit problems and identify two major failure modes: 1) circuit-recognition hallucinations, particularly incorrect source-polarity detection, and 2) reasoning-process hallucinations, such as incorrect current directions. To address recognition errors, we integrate a fine-tuned YOLO detector and OpenCV processing to isolate voltage and current sources, enabling Gemini to re-identify source polarities from cropped images with near-perfect accuracy. To reduce reasoning errors, we introduce an ngspice-based verification loop in which Gemini generates a .cir file, ngspice simulates the circuit, and discrepancies trigger iterative regeneration with optional human-in-the-loop review. Across 83 problems, the proposed pipeline achieves a 97.59% success rate (81 correct solutions), substantially outperforming Gemini 2.5 Pro's original 79.52% accuracy. This system extends LLM capabilities for multimodal engineering problem solving and supports the creation of high-quality educational datasets and AI-powered instructional tools.
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- Europe > Italy > Sicily > Palermo (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Research Report > New Finding (1.00)
- Workflow (0.93)
- Instructional Material > Course Syllabus & Notes (0.67)
- Education > Curriculum > Subject-Specific Education (0.69)
- Education > Educational Technology > Educational Software > Computer Based Training (0.46)
- Education > Educational Setting > Higher Education (0.46)
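The ngspice-based verification loop the abstract describes can be sketched as follows: a model-generated netlist is simulated, the simulated quantity is compared with the model's claimed answer, and a mismatch triggers regeneration. The function names, the regeneration callback, and the single-scalar comparison are illustrative assumptions rather than the paper's implementation; `ngspice -b` is ngspice's standard batch mode.

```python
import subprocess

def run_ngspice(cir_path):
    """Run ngspice in batch mode on a .cir netlist and return its stdout.

    Requires ngspice to be installed; parsing the printed node voltages
    out of stdout is left to the caller."""
    result = subprocess.run(["ngspice", "-b", cir_path],
                            capture_output=True, text=True)
    return result.stdout

def verify_solution(claimed, simulate, netlist, tol=1e-3, max_iters=3,
                    regenerate=None):
    """Compare the LLM's claimed value against the simulated one.

    `simulate` maps a netlist string to a measured scalar; on mismatch,
    the (hypothetical) `regenerate` callback asks the LLM for a revised
    netlist, mimicking the paper's iterative regeneration step."""
    for _ in range(max_iters):
        measured = simulate(netlist)
        if abs(measured - claimed) <= tol:
            return True, measured
        if regenerate is None:
            break  # no regeneration available; flag for human review
        netlist = regenerate(netlist, measured)
    return False, measured
```

In the full pipeline, `simulate` would wrap `run_ngspice` plus output parsing; the stub form above keeps the loop testable without a SPICE installation.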
Deep-Learning-Based Pre-Layout Parasitic Capacitance Prediction on SRAM Designs
Shen, Shan, Yang, Dingcheng, Xie, Yuyang, Pei, Chunyan, Yu, Wenjian, Yu, Bei
To achieve higher system energy efficiency, SRAM in SoCs is often customized. Parasitic effects cause notable discrepancies between pre-layout and post-layout circuit simulations, making it difficult to converge on design parameters and leading to excessive design iterations. Can these parasitics be predicted accurately from the pre-layout circuit alone, enabling parasitic-aware pre-layout simulation? In this work, we propose a deep-learning-based two-stage model that accurately predicts these parasitics at the pre-layout stage. The model combines a Graph Neural Network (GNN) classifier and Multi-Layer Perceptron (MLP) regressors, effectively managing the class imbalance of net parasitics in SRAM circuits. We also employ Focal Loss to mitigate the impact of abundant internal-net samples, and integrate subcircuit information into the graph to capture the hierarchical structure of schematics. Experiments on 4 real SRAM designs show that our approach not only surpasses the state-of-the-art model in parasitic prediction, reducing error by up to 19X, but also accelerates the simulation process by up to 598X.
- Asia > China > Hong Kong (0.04)
- North America > United States > Florida > Pinellas County > Clearwater (0.04)
- Europe > Greece (0.04)
- Asia > China > Beijing > Beijing (0.04)
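The class-imbalance handling above hinges on the focal loss term, which down-weights abundant, easily classified samples. A minimal NumPy version of the standard binary form (Lin et al.), offered as background rather than the authors' exact variant:

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    probs  -- predicted probability of the positive class, per sample
    labels -- 0/1 ground truth
    The (1 - p_t)^gamma factor shrinks the contribution of confident,
    correct predictions, so rare classes dominate the gradient."""
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    p_t = np.where(labels == 1, probs, 1 - probs)
    alpha_t = np.where(labels == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))
```

With `gamma=0` and `alpha=0.5` this reduces to (half of) ordinary cross-entropy, which is one way to sanity-check an implementation.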
From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction
Acar, Ayberk, Smith, Mariana, Al-Zogbi, Lidia, Watts, Tanner, Li, Fangjie, Li, Hao, Yilmaz, Nural, Scheikl, Paul Maria, d'Almeida, Jesse F., Sharma, Susheela, Branscombe, Lauren, Ertop, Tayfun Efe, Webster, Robert J. III, Oguz, Ipek, Kuntz, Alan, Krieger, Axel, Wu, Jie Ying
Surgical automation requires precise guidance and understanding of the scene. Current methods in the literature rely on bulky depth cameras to create maps of the anatomy; however, this does not translate well to space-limited clinical applications. Monocular cameras are small and allow minimally invasive surgery in tight spaces, but additional processing is required to generate 3D scene understanding. We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy. To ensure the most precise reconstruction, we compare the performance of different structure-from-motion algorithms on mapping central airway obstructions, and test the pipeline on a downstream task of tumor resection. In several metrics, including post-procedure tissue-model evaluation, our pipeline performs comparably to RGB-D cameras and, in some cases, even surpasses their performance. These promising results demonstrate that automation guidance can be achieved in minimally invasive procedures with monocular cameras. This study is a step toward the complete autonomy of surgical robots.
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.05)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.94)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
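The structure-from-motion algorithms being compared all rest on the same core step: triangulating a 3D point from its projections in two calibrated views. A minimal linear (DLT) triangulation, offered as background rather than the authors' pipeline:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2 -- 3x4 camera projection matrices
    x1, x2 -- (u, v) pixel observations of the same 3D point
    Each observation contributes two rows of the homogeneous system
    A X = 0; the least-squares solution is the smallest singular vector."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize to world coordinates
```

A full SfM system wraps this in feature matching, pose estimation, and bundle adjustment; the triangulation above is only the geometric kernel.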
Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems
Angioli, Marco, Barbirotta, Marcello, Cheikh, Abdallah, Mastrandrea, Antonio, Menichelli, Francesco, Olivieri, Mauro
As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms on embedded devices poses significant challenges due to limited computational power, memory, and energy resources. This paper presents algorithmic and hardware techniques to efficiently implement two LinearUCB contextual bandit algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman-Morrison-Woodbury formula reduce model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in execution time and energy consumption, demonstrating the effectiveness of combining algorithmic and hardware optimizations to enhance learning models for edge computing environments with low-power and real-time requirements.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- South America > Brazil (0.04)
- Information Technology (1.00)
- Energy (0.87)
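The Sherman-Morrison-Woodbury simplification mentioned above replaces the per-round matrix inversion in LinearUCB with a rank-1 update of the stored inverse, turning an O(d^3) step into O(d^2). A sketch, with illustrative variable names:

```python
import numpy as np

def sherman_morrison_update(A_inv, x):
    """Rank-1 update: returns (A + x x^T)^{-1} given A^{-1}.

    Implements (A + x x^T)^{-1} = A^{-1} - (A^{-1} x x^T A^{-1}) / (1 + x^T A^{-1} x),
    the identity that lets LinearUCB maintain the inverse design matrix
    incrementally instead of re-inverting each round."""
    Ax = A_inv @ x
    return A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)

def linucb_score(A_inv, b, x, alpha=1.0):
    """LinUCB arm score: predicted reward plus exploration bonus."""
    theta = A_inv @ b
    return float(x @ theta + alpha * np.sqrt(x @ A_inv @ x))
```

On a microcontroller the saved cubic-cost inversion is exactly where most of the execution-time improvement comes from.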
Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data
Das, Sudeshna, Ge, Yao, Guo, Yuting, Rajwal, Swati, Hairston, JaMor, Powell, Jeanne, Walker, Drew, Peddireddy, Snigdha, Lakamana, Sahithi, Bozkurt, Selen, Reyna, Matthew, Sameni, Reza, Xiao, Yunyu, Kim, Sangmi, Chandler, Rasheeta, Hernandez, Natalie, Mowery, Danielle, Wightman, Rachel, Love, Jennifer, Spadaro, Anthony, Perrone, Jeanmarie, Sarker, Abeed
Retrieval augmented generation (RAG) constrains generative model outputs and mitigates hallucination by supplying relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, limiting the volume of knowledge from which an answer can be generated. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof of concept for this framework on query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource-constrained settings, enabling researchers to obtain near real-time data from users.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (0.69)
- Research Report > Experimental Study (0.48)
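A minimal sketch of the two-layer idea: a first pass selects whole posts, a second pass re-ranks passages drawn only from those posts, so the limited LLM context receives only the most relevant snippets. The token-overlap scorer below is a deliberately crude stand-in for the framework's actual retriever:

```python
def score(query, text):
    """Crude lexical relevance: shared-token count (embedding stand-in)."""
    q = set(query.lower().replace(".", " ").split())
    return len(q & set(text.lower().replace(".", " ").split()))

def two_layer_retrieve(query, corpus, k_docs=2, k_passages=2):
    """Layer 1: rank whole posts and keep the top k_docs.
    Layer 2: split the kept posts into sentences and re-rank those,
    returning only the top k_passages for the LLM context."""
    docs = sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k_docs]
    passages = [s.strip() for d in docs for s in d.split(".") if s.strip()]
    return sorted(passages, key=lambda p: score(query, p), reverse=True)[:k_passages]
```

The two-stage structure is what lets a small context window cover a large forum corpus: the expensive second ranking only ever sees sentences from already-relevant posts.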
GeckOpt: LLM System Efficiency via Intent-Based Tool Selection
Fore, Michael, Singh, Simranjit, Stamoulis, Dimitrios
In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By identifying the intent behind user prompts at runtime, we narrow down the API toolset required for task execution, reducing token consumption by up to 24.6%. Early results on a real-world, massively parallel Copilot platform with over 100 GPT-4-Turbo nodes show cost reductions and potential for improving LLM-based system efficiency.
- North America > United States > Florida > Pinellas County > Clearwater (0.06)
- North America > United States > Washington > King County > Redmond (0.05)
- North America > United States > Virginia > Fairfax County > Reston (0.05)
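A toy sketch of intent-based tool narrowing: classify the prompt's intent, then expose only that intent's API subset to the model, so the full tool schema never enters the context. GeckOpt uses GPT-driven intent recognition; the keyword matcher, categories, and tool names below are stand-in assumptions:

```python
# Hypothetical intent -> API-subset routing; names are illustrative,
# not GeckOpt's actual configuration.
TOOLSETS = {
    "calendar": ["create_event", "list_events"],
    "email": ["send_mail", "search_mail"],
    "files": ["open_file", "search_files"],
}

KEYWORDS = {
    "calendar": {"meeting", "schedule", "event"},
    "email": {"mail", "send", "inbox"},
    "files": {"file", "document", "open"},
}

def select_tools(prompt):
    """Pick the intent whose keyword set best overlaps the prompt,
    then return only that intent's tools. Token savings come from
    serializing this small subset instead of the full toolset."""
    words = set(prompt.lower().split())
    best = max(KEYWORDS, key=lambda k: len(KEYWORDS[k] & words))
    return best, TOOLSETS[best]
```

Replacing the keyword matcher with an LLM classification call recovers the GPT-driven variant while keeping the same routing structure.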
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
Zheng, Longtao, Wang, Rundong, Wang, Xinrun, An, Bo
Building agents with large language models (LLMs) for computer control is a burgeoning research area, where the agent receives computer states and performs actions to complete complex tasks. Previous computer agents have demonstrated the benefits of in-context learning (ICL); however, their performance is hindered by several issues. First, the limited context length of LLMs and complex computer states restrict the number of exemplars, as a single webpage can consume the entire context. Second, the exemplars in current methods, such as high-level plans and multi-choice questions, cannot represent complete trajectories, leading to suboptimal performance in long-horizon tasks. Third, existing computer agents rely on task-specific exemplars and overlook the similarity among tasks, resulting in poor generalization to novel tasks. To address these challenges, we introduce Synapse, a computer agent featuring three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context, ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions to improve multi-step decision-making, and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks. We evaluate Synapse on MiniWoB++, a standard task suite, and Mind2Web, a real-world website benchmark. In MiniWoB++, Synapse achieves a 99.2% average success rate (a 10% relative improvement) across 64 tasks using demonstrations from only 48 tasks. Notably, Synapse is the first ICL method to solve the book-flight task in MiniWoB++. Synapse also exhibits a 56% relative improvement in average step success rate over the previous state-of-the-art prompting scheme in Mind2Web.
- North America > United States > Connecticut > Hartford County > Hartford (0.04)
- North America > United States > New York > Suffolk County > Islip (0.04)
- North America > United States > Texas > Taylor County > Abilene (0.04)
- Workflow (1.00)
- Research Report (0.63)
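Component iii), the exemplar memory, can be sketched as an embedding store with nearest-neighbor retrieval over task descriptions. The hashing-trick embedding below is a toy stand-in for a learned sentence encoder:

```python
import numpy as np

def embed(text, dim=64):
    """Toy deterministic bag-of-words embedding via the hashing trick;
    a real system would use a learned encoder."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

class ExemplarMemory:
    """Stores trajectory exemplars keyed by task embeddings and retrieves
    the most similar ones for a novel task, in the spirit of Synapse's
    exemplar memory."""
    def __init__(self):
        self.keys, self.trajectories = [], []

    def add(self, task, trajectory):
        self.keys.append(embed(task))
        self.trajectories.append(trajectory)

    def retrieve(self, task, k=1):
        # Cosine similarity (vectors are unit-normalized), top-k lookup.
        sims = np.array([embed(task) @ key for key in self.keys])
        top = np.argsort(-sims)[:k]
        return [self.trajectories[i] for i in top]
```

Retrieved trajectories would then be serialized as trajectory-as-exemplar prompts for the new task.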
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
Sun, Jiashuo, Xu, Chengjin, Tang, Lumingyuan, Wang, Saizhuo, Lin, Chen, Gong, Yeyun, Ni, Lionel M., Shum, Heung-Yeung, Guo, Jian
Although large language models (LLMs) have achieved significant success in various tasks, they often struggle with hallucination problems, especially in scenarios requiring deep and responsible reasoning. These issues could be partially addressed by introducing external knowledge graphs (KG) into LLM reasoning. In this paper, we propose a new LLM-KG integrating paradigm ``$\hbox{LLM}\otimes\hbox{KG}$'' which treats the LLM as an agent that interactively explores related entities and relations on KGs and performs reasoning based on the retrieved knowledge. We further implement this paradigm by introducing a new approach called Think-on-Graph (ToG), in which the LLM agent iteratively executes beam search on the KG, discovers the most promising reasoning paths, and returns the most likely reasoning results. We use a number of well-designed experiments to examine and illustrate the following advantages of ToG: 1) compared with LLMs alone, ToG has better deep reasoning power; 2) ToG provides knowledge traceability and knowledge correctability by leveraging LLM reasoning and expert feedback; 3) ToG provides a flexible plug-and-play framework for different LLMs, KGs, and prompting strategies without any additional training cost; 4) the performance of ToG with small LLMs can exceed that of large LLMs such as GPT-4 in certain scenarios, reducing the cost of LLM deployment and application. As a training-free method with lower computational cost and better generality, ToG achieves overall SOTA on 6 out of 9 datasets, where most previous SOTAs rely on additional training.
- North America > United States > Washington > King County > Seattle (0.28)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California (0.14)
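ToG's iterative beam search over the KG can be sketched on a toy triple store; the path-scoring function below stands in for the LLM's judgment of how promising each partial reasoning path is:

```python
# Toy knowledge graph: (head, relation) -> tail triples.
KG = {
    ("canberra", "capital_of"): "australia",
    ("australia", "continent"): "oceania",
    ("canberra", "located_in"): "act",
    ("act", "part_of"): "australia",
}

def expand(entity):
    """All (relation, tail) edges leaving an entity."""
    return [(r, t) for (h, r), t in KG.items() if h == entity]

def beam_search(start, score_path, width=2, depth=2):
    """ToG-style exploration: keep the `width` highest-scoring paths
    and extend each one hop at a time, up to `depth` hops.

    A path is stored as [entity, relation, entity, ...]; in ToG the
    scorer is the LLM itself, queried on each candidate path."""
    beams = [([start], 0.0)]
    for _ in range(depth):
        candidates = []
        for path, _ in beams:
            for rel, tail in expand(path[-1]):
                new_path = path + [rel, tail]
                candidates.append((new_path, score_path(new_path)))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda c: -c[1])[:width]
    return beams
```

The surviving top path (entities and relations) is what the agent hands back to the LLM to ground its final answer.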
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Kim, Seungone, Joo, Se June, Kim, Doyoung, Jang, Joel, Ye, Seonghyeon, Shin, Jamin, Seo, Minjoon
Language models (LMs) with fewer than 100B parameters are known to perform poorly on chain-of-thought (CoT) reasoning, in contrast to large LMs, when solving unseen tasks. In this work, we aim to equip smaller LMs with step-by-step reasoning capability by instruction tuning with CoT rationales. To achieve this goal, we first introduce a new instruction-tuning dataset called the CoT Collection, which augments the existing Flan Collection (including only 9 CoT tasks) with an additional 1.84 million rationales across 1,060 tasks. We show that CoT fine-tuning Flan-T5 (3B & 11B) with the CoT Collection enables smaller LMs to have better CoT capabilities on unseen tasks. On the BIG-Bench-Hard (BBH) benchmark, we report an average improvement of +4.34% (Flan-T5 3B) and +2.60% (Flan-T5 11B) in zero-shot task accuracy. Furthermore, we show that instruction tuning with the CoT Collection gives LMs stronger few-shot learning capabilities on 4 domain-specific tasks, resulting in an improvement of +2.24% (Flan-T5 3B) and +2.37% (Flan-T5 11B), even outperforming ChatGPT, which utilizes demonstrations up to the maximum context length, by a +13.98% margin. Our code, the CoT Collection data, and model checkpoints are publicly available.
- North America > United States > Florida > Pinellas County > Clearwater (0.04)
- Europe > United Kingdom > Scotland (0.04)
- Research Report (0.82)
- Personal > Interview (0.46)
- Education (0.93)
- Leisure & Entertainment (0.68)
- Law (0.68)
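Building the instruction-tuning data amounts to pairing each instruction with a rationale-then-answer target. A minimal formatter for one example; the template strings are an assumption for illustration, not the CoT Collection's own format:

```python
def to_cot_example(instruction, rationale, answer):
    """Render one (instruction, rationale, answer) triple as a
    source/target pair for sequence-to-sequence instruction tuning.

    The model is trained to emit the rationale before the answer,
    which is what transfers step-by-step reasoning to smaller LMs."""
    source = f"Instruction: {instruction}\nLet's think step by step."
    target = f"{rationale} So the answer is {answer}."
    return {"source": source, "target": target}
```

Mapping this over all 1.84 million rationales yields the fine-tuning corpus; the source/target split matches how encoder-decoder models like Flan-T5 consume training pairs.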
Towards Flexibility and Interpretability of Gaussian Process State-Space Model
Lin, Zhidi, Yin, Feng, Maroñas, Juan
The Gaussian process state-space model (GPSSM) has garnered considerable attention over the past decade. However, the standard GP commonly used in GPSSM studies, equipped with an elementary kernel such as the squared exponential or Mat\'{e}rn kernel, limits the model's representation power and substantially restricts its applicability to complex scenarios. To address this issue, we propose a new class of probabilistic state-space models called TGPSSMs, which leverage a parametric normalizing flow to enrich the GP priors in the standard GPSSM, enabling greater flexibility and expressivity. Additionally, we present a scalable variational inference algorithm that offers a flexible and optimal structure for the variational distribution of latent states. The proposed algorithm is interpretable and computationally efficient due to the sparse GP representation and the bijective nature of the normalizing flow. Moreover, we incorporate a constrained optimization framework into the algorithm to enhance the state-space representation capabilities and optimize the hyperparameters, leading to superior learning and inference performance. Experimental results on synthetic and real datasets corroborate that the proposed TGPSSM outperforms several state-of-the-art methods. The accompanying source code is available at \url{https://github.com/zhidilin/TGPSSM}.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
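For context, a GPSSM in the notation common to this literature places a GP prior on the transition function of a state-space model:

```latex
x_{t+1} = f(x_t) + w_t, \quad w_t \sim \mathcal{N}(0, Q), \qquad
y_t = g(x_t) + v_t, \quad v_t \sim \mathcal{N}(0, R), \qquad
f \sim \mathcal{GP}\bigl(0, k(\cdot, \cdot)\bigr).
```

Reading the abstract, the TGPSSM enriches this prior by composing $f$ with a parametric, invertible normalizing flow $\mathbb{G}_{\theta}$, i.e. $\tilde{f} = \mathbb{G}_{\theta} \circ f$; the bijectivity of $\mathbb{G}_{\theta}$ is what keeps the variational inference over latent states tractable.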